21 research outputs found

    Abstraction Raising in General-Purpose Compilers

    Get PDF

    PET-to-MLIR: A polyhedral front-end for MLIR

    Get PDF
    We present PET-to-MLIR, a new tool to enter the MLIR compiler framework from C source. The tool is based on the popular PET and ISL libraries for extracting and manipulating quasi-affine sets and relations, and on Loop Tactics, a declarative optimizer. The use of PET brings advanced diagnostics and full support for C by relying on the Clang parser. ISL allows easy manipulation of the polyhedral representation and efficient code generation. Loop Tactics, on the other hand, enables us to detect computational motifs transparently and lift the entry point into MLIR, thus enabling domain-specific optimizations in general-purpose code. We demonstrate our tool on the PolyBench/C benchmark suite and show that it successfully lowers most of the benchmarks to MLIR's affine dialect. We believe that our tool can benefit research in the compiler community by providing an automatic way to translate C code to the MLIR affine dialect.
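    The quasi-affine sets that PET extracts and ISL manipulates can be illustrated with a toy sketch. This is plain Python, not the actual PET/ISL API; it only shows what an iteration domain such as { [i, j] : 0 <= i < n and 0 <= j <= i } denotes and the visiting order a generated loop nest would follow.

    ```python
    # Toy sketch of a quasi-affine iteration domain in the spirit of the
    # sets ISL manipulates: { [i, j] : 0 <= i < n and 0 <= j <= i }.
    # Illustrative Python only, not the ISL/PET API.

    def triangular_domain(n):
        """Enumerate the integer points of the set above in lexicographic
        order, mirroring the order a generated loop nest visits them."""
        return [(i, j) for i in range(n) for j in range(i + 1)]

    # The equivalent loop nest ISL's code generator would emit in C:
    #   for (int i = 0; i < n; ++i)
    #     for (int j = 0; j <= i; ++j)
    #       S(i, j);

    print(triangular_domain(3))
    # [(0, 0), (1, 0), (1, 1), (2, 0), (2, 1), (2, 2)]
    ```

    Code generation for such a set amounts to emitting loop bounds that scan exactly these integer points, which is what makes the representation convenient for both analysis and lowering.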

    SEER: Super-Optimization Explorer for HLS using E-graph Rewriting with MLIR

    Full text link
    High-level synthesis (HLS) is a process that automatically translates a software program in a high-level language into a low-level hardware description. However, the hardware designs produced by HLS tools still suffer from a significant performance gap compared to manual implementations. This is because the input HLS programs must still be written using hardware design principles. Existing techniques either leave the program source unchanged or perform a fixed sequence of source transformation passes, potentially missing opportunities to find the optimal design. We propose a super-optimization approach for HLS that automatically rewrites an arbitrary software program into efficient HLS code that can be used to generate an optimized hardware design. We developed a toolflow named SEER, based on the e-graph data structure, to efficiently explore equivalent implementations of a program at scale. SEER provides an extensible framework, orchestrating existing software compiler passes and hardware synthesis optimizers. Our work is the first attempt to exploit e-graph rewriting for large software compiler frameworks, such as MLIR. Across a set of open-source benchmarks, we show that SEER achieves up to 38x the performance within 1.4x the area of the original program. Via an Intel-provided case study, SEER demonstrates the potential to outperform manually optimized designs produced by hardware experts.
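    The core idea of exploring equivalent implementations via rewriting can be sketched in a drastically simplified form. A real e-graph (as used by SEER and libraries such as egg) shares subterms and represents exponentially many equivalent programs compactly; this sketch merely computes the closure of one expression under a few rewrite rules, with rule names and string-level expressions chosen purely for illustration.

    ```python
    # Drastically simplified stand-in for e-graph exploration: breadth-first
    # closure of an expression under rewrite rules. A real e-graph shares
    # subterms and scales far better; rules here are illustrative only.
    from collections import deque

    # Each rule maps one expression form to an equivalent form.
    RULES = {
        "x * 2": "x << 1",   # strength reduction: multiply -> shift
        "x << 1": "x + x",   # shift expressed as addition
        "x + x": "x * 2",    # addition folded back to multiplication
    }

    def equivalents(expr, limit=10):
        """Collect all expressions reachable from `expr` via RULES."""
        seen, frontier = {expr}, deque([expr])
        while frontier and len(seen) < limit:
            nxt = RULES.get(frontier.popleft())
            if nxt and nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
        return seen

    print(sorted(equivalents("x * 2")))
    # ['x * 2', 'x + x', 'x << 1']
    ```

    A super-optimizer then picks, from the explored equivalence class, the variant that minimizes a cost model; for HLS that cost reflects latency and area rather than instruction count.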

    Transformations Déclaratives dans le ModÚle Polyédrique

    Get PDF
    Despite the availability of sophisticated automatic optimizers, performance-critical code sections are in practice still tuned by human experts. Pragma-based languages such as OpenMP or OpenACC are the standard interface for applying such transformations to large code bases, and loop transformation pragmas would be a straightforward extension providing fine-grained control over a compiler's loop optimizer. However, the manual optimization of programs via explicit sequences of directives is unlikely to fully solve this problem, as expressing complex optimization sequences explicitly results in difficult-to-read, non-performance-portable code. We address this problem by presenting a novel framework of composable program transformations based on the internal tree-like program representation of a polyhedral compiler. Based on a set of tree matchers and transformers, we describe an embedded transformation language which provides the foundation for the development of program optimization tactics. Using this language, we express core building blocks such as loop tiling, fusion, or data-layout transformations, and compose them into higher-level transformations expressing algorithm-specific optimization strategies for stencils, dense linear algebra, etc. We expect our approach to simplify the development of polyhedral optimizers and the integration of polyhedral and syntactic approaches.
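    Loop tiling, one of the core building blocks mentioned above, can be sketched as a schedule transformation. The framework in the work operates on ISL schedule trees; this hypothetical Python version only shows the essential legality argument: tiling regroups iterations into blocks without adding or dropping any iteration point.

    ```python
    # Minimal sketch of 2-D loop tiling as a schedule transformation
    # (illustrative Python; the actual framework rewrites ISL schedule trees).

    def original(n):
        """Row-major order: for i { for j { S(i, j); } }"""
        return [(i, j) for i in range(n) for j in range(n)]

    def tiled(n, t):
        """Tiled order: loops over t-by-t blocks, then points inside each."""
        pts = []
        for it in range(0, n, t):                     # tile loops
            for jt in range(0, n, t):
                for i in range(it, min(it + t, n)):   # point loops
                    for j in range(jt, min(jt + t, n)):
                        pts.append((i, j))
        return pts

    # Tiling visits the same set of points exactly once, in a new order:
    assert sorted(tiled(6, 4)) == original(6)   # same iteration domain
    assert tiled(6, 4) != original(6)           # but a different schedule
    ```

    Whether the reordering is legal for a given loop nest depends on its dependences, which is precisely what the polyhedral representation lets a compiler check; composing such blocks (tile, then fuse, then re-layout data) is what the tactics language described above expresses declaratively.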

    TDO-CIM: Transparent Detection and Offloading for Computation In-memory

    Get PDF
    Computation in-memory is a promising non-von Neumann approach aiming at eliminating data transfers to and from the memory subsystem. Although many architectures have been proposed, compiler support for such architectures is still lagging behind. In this paper, we close this gap by proposing an end-to-end compilation flow for in-memory computing based on the LLVM compiler infrastructure. Starting from sequential code, our approach automatically detects, optimizes, and offloads kernels suitable for in-memory acceleration. We demonstrate our compiler tool-flow on the PolyBench/C benchmark suite and evaluate the benefits of our proposed in-memory architecture, simulated in Gem5, by comparing it with a state-of-the-art von Neumann architecture. (Full version of a DATE 2020 publication.)

    Progressive Raising in Multi-level IR

    Get PDF
    Multi-level intermediate representations (IR) show great promise for lowering the design costs of domain-specific compilers by providing a reusable, extensible, and non-opinionated framework for expressing domain-specific and high-level abstractions directly in the IR. But while such frameworks support the progressive lowering of high-level representations to low-level IR, they do not raise in the opposite direction. Thus, the entry point into the compilation pipeline defines the highest level of abstraction for all subsequent transformations, limiting the set of applicable optimizations, in particular for general-purpose languages that are not semantically rich enough to model the required abstractions. We propose Progressive Raising, a complementary approach to the progressive lowering in multi-level IRs that raises from lower- to higher-level abstractions to leverage domain-specific transformations for low-level representations. We further introduce Multi-Level Tactics, our declarative approach for progressive raising, implemented on top of the MLIR framework, and demonstrate the progressive raising from affine loop nests specified in a general-purpose language to high-level linear algebra operations. Our raising paths leverage subsequent high-level domain-specific transformations with significant performance improvements.
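    What "raising from affine loops to linear algebra operations" buys can be illustrated with a toy example. Multi-Level Tactics performs this on MLIR affine IR; the sketch below just shows the two abstraction levels side by side and checks that the raised high-level operation computes the same result as the low-level loop nest it replaces (all names here are illustrative, not the tool's API).

    ```python
    # Toy illustration of raising: the same matrix multiplication at two
    # abstraction levels. Raising recognizes the loop nest on the left of
    # the pipeline and rewrites it as one high-level linear-algebra op.

    def matmul_loops(A, B):
        """The affine loop nest as written in a general-purpose language."""
        n, k, m = len(A), len(B), len(B[0])
        C = [[0] * m for _ in range(n)]
        for i in range(n):
            for j in range(m):
                for p in range(k):
                    C[i][j] += A[i][p] * B[p][j]
        return C

    def matmul_op(A, B):
        """The raised form: one 'matmul' operation, the target abstraction
        on which domain-specific transformations become applicable."""
        return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
                for row in A]

    A = [[1, 2], [3, 4]]
    B = [[5, 6], [7, 8]]
    assert matmul_loops(A, B) == matmul_op(A, B) == [[19, 22], [43, 50]]
    ```

    Once the computation is expressed as a single high-level operation, transformations that are impractical at the loop level, such as swapping in a tuned library call or an accelerator mapping, become simple rewrites.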

    TC-CIM: Empowering Tensor Comprehensions for Computing-In-Memory

    Get PDF
    Memristor-based, non-von-Neumann architectures performing tensor operations directly in memory are a promising approach to address the ever-increasing demand for energy-efficient, high-throughput hardware accelerators for Machine Learning (ML) inference. A major challenge for the programmability and exploitation of such Computing-In-Memory (CIM) architectures consists in the efficient mapping of tensor operations from high-level ML frameworks to fixed-function hardware blocks implementing in-memory computations. We demonstrate the programmability of memristor-based accelerators with TC-CIM, a fully automatic, end-to-end compilation flow from Tensor Comprehensions, a mathematical notation for tensor operations, to fixed-function memristor-based hardware blocks. Operations suitable for acceleration are identified using Loop Tactics, a declarative framework to describe computational patterns in a polyhedral representation. We evaluate our compilation flow on a system-level simulator based on Gem5, incorporating crossbar arrays of memristive devices. Our results show that TC-CIM reliably recognizes tensor operations commonly used in ML workloads across multiple benchmarks and offloads these operations to the accelerator.
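    The offloading decision at the end of such a flow reduces to dispatching recognized operations to the fixed-function blocks and leaving everything else on the CPU. A hypothetical sketch, with the set of supported patterns and all names invented for illustration:

    ```python
    # Hypothetical sketch of the offload dispatch in a TC-CIM-style flow:
    # operations recognized by pattern matching go to the accelerator,
    # everything else takes the CPU fallback path. Names are illustrative.

    CIM_SUPPORTED = {"matvec", "matmul"}   # patterns the crossbar executes

    def dispatch(op_name):
        """Return the execution target chosen for a recognized operation."""
        return "cim-accelerator" if op_name in CIM_SUPPORTED else "cpu-fallback"

    for op in ("matmul", "softmax"):
        print(op, "->", dispatch(op))
    # matmul -> cim-accelerator
    # softmax -> cpu-fallback
    ```

    The hard part, which the abstract attributes to Loop Tactics, is producing the `op_name` classification in the first place, i.e. proving that a given loop nest really is a matvec or matmul in the polyhedral sense.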

    Mixed-species aggregations in arthropods

    Get PDF
    This review offers the first synthesis of the research on mixed-species groupings of arthropods and highlights the behavioural and evolutionary questions raised by such behaviour. Mixed-species groups are commonly found in mammals and birds. Such groups are also observed in a large range of arthropod taxa independent of their level of sociality. Several examples are presented to highlight the mechanisms underlying such groupings, particularly the evidence for phylogenetic proximity between members that promotes cross-species recognition. The advantages offered by such aggregates are described and discussed. These advantages can be attributed to the increase in group size and could be identical to those of non-mixed groupings, but competition-cooperation dynamics might also be involved, and such effects may differ between homo- and heterospecific groups. We discuss three extreme cases of interspecific recognition that are likely involved in mixed-species groups as vectors for cross-species aggregation: tolerance behaviour between two social species, a one-way mechanism in which one species is attractive to others, and a two-way mechanism of mutual attraction. As shown in this review, the study of mixed-species groups offers biologists an interesting way to explore the frontiers of cooperation-competition, including the process of sympatric speciation.

    Declarative Loop Tactics for Domain-specific Optimization

    No full text
    Increasingly complex hardware makes the design of effective compilers difficult. To reduce this problem, we introduce Declarative Loop Tactics, which is a novel framework of composable program transformations based on an internal tree-like program representation of a polyhedral compiler. The framework is based on a declarative C++ API built around easy-to-program matchers and builders, which provide the foundation to develop loop optimization strategies. Using our matchers and builders, we express computational patterns and core building blocks, such as loop tiling, fusion, and data-layout transformations, and compose them into algorithm-specific optimizations. Declarative Loop Tactics (Loop Tactics for short) can be applied to many domains. For two of them, stencils and linear algebra, we show how developers can express sophisticated domain-specific optimizations as a set of composable transformations or calls to optimized libraries. By allowing developers to add highly customized optimizations for a given computational pattern, we expect our approach to reduce the need for DSLs and to extend the range of optimizations that can be performed by a current general-purpose compiler.
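    The matcher/builder split described above can be sketched in miniature. The actual Loop Tactics API is declarative C++ over ISL schedule trees; this toy Python version only conveys the division of labour: a matcher recognizes a structural pattern in the loop tree, and a builder constructs the transformed tree, here a simple loop interchange.

    ```python
    # Toy matcher/builder sketch (illustrative Python, not the C++ API):
    # a matcher recognizes a tree shape, a builder rebuilds the new tree.

    class Loop:
        """One loop level; `body` is a nested Loop or a statement string."""
        def __init__(self, var, body):
            self.var, self.body = var, body

    def match_perfect_nest(node, depth):
        """Matcher: does `node` start a perfectly nested band of
        `depth` loops (each loop's body is exactly the next loop)?"""
        for _ in range(depth):
            if not isinstance(node, Loop):
                return False
            node = node.body
        return True

    def build_interchanged(node):
        """Builder: swap the two outermost loops of a matched 2-deep nest."""
        inner = node.body
        return Loop(inner.var, Loop(node.var, inner.body))

    nest = Loop("i", Loop("j", "S(i, j)"))
    if match_perfect_nest(nest, 2):      # compose: match, then rebuild
        nest = build_interchanged(nest)
    print(nest.var, nest.body.var)       # j i
    ```

    Composing several such match/build pairs (tile, then interchange, then fuse) is how the framework assembles algorithm-specific strategies from small, reusable pieces.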
